We review two powerful methods to test the Gaussianity of the cosmic microwave
background (CMB): one based on the distribution of spherical wavelet coefficients and
the other on smooth tests of goodness-of-fit. The spherical wavelet families proposed
to analyse the CMB are the Haar and the Mexican Hat ones. The latter is preferred
for detecting non-Gaussian homogeneous and isotropic primordial models containing
some amount of skewness or kurtosis. Smooth tests of goodness-of-fit have recently
been introduced in the field, showing some interesting properties. We will discuss the
smooth tests of goodness-of-fit developed by Rayner and Best for the univariate as well
as for the multivariate analysis.



1 Introduction 
Establishing the statistical character of the CMB fluctuations is one of the most fundamental
problems in cosmology. The simplest inflationary theories predict Gaussian fluctuations,
whereas non-standard inflation and topological defects predict different non-Gaussian ones.
Recent sensitive CMB observations (Boomerang, MAXIMA, DASI, WMAP) have so far shown no
evidence of departures from Gaussianity. This fact has put strong constraints on
the amount of cosmic strings in the universe and on the non-linear coupling parameter in
the case of a quadratic potential (Spergel et al. 2003).

There is no unique way to search for non-Gaussianity in the CMB. Different features
will be best probed by methods which are well adapted to reveal them. The methods can
work in real space and also in Fourier, wavelet or other spaces in which the non-Gaussian
features can be enhanced. In this work we will focus on two powerful methods to search
for non-Gaussianity: one based on spherical wavelets and the other on smooth goodness-of-fit
tests. Only two spherical wavelet families have been considered in CMB analyses: Haar
and Mexican Hat. Some of their properties, as well as their potential to test normality,
will be discussed. The smooth tests of goodness-of-fit have been shown to be very powerful
in testing the Gaussianity of data in many fields outside the CMB. Here we introduce the
tests in the CMB context and explain how they can be applied in a satisfactory way. We
will first consider the univariate case, already introduced in Cayon et al. (2003b), and later
the multivariate one. The problems concerning their application to the CMB sky will
be discussed. We will also review some recent results obtained by applying the previous
methods to different experiments like COBE-DMR and MAXIMA.

2 Spherical wavelets 

Although several works have already used planar wavelets to analyse the
CMB temperature fluctuations in small patches of the sky, it is clear that a proper analysis
of any area of the sky will involve spherical wavelets. The latter have been developed
recently. Schroder and Sweldens (1995) introduced the spherical Haar wavelets (SHW) as a
generalization of the planar ones to the pixelised sphere. They are orthogonal and adapted to
a given pixelization of the sky, which must be hierarchical. The spherical Haar wavelets
have already been applied twice to the analysis of CMB maps. Tenorio et al. (1999)
applied them to a QuadCube pixelization of the CMB sky, for which a correction of the pixel
area has to be made. Barreiro et al. (2000) tested the Gaussianity of the COBE-DMR
data on the HEALPix equal-area pixelization, for which no correction is required. They found
the COBE-DMR data consistent with Gaussianity.

An extension of isotropic wavelets to the sphere has been proposed by Antoine and
Vandergheynst (1998) following a group theory approach. It is based on the stereographic
projection, which carries over from the plane to the sphere the following four basic properties:
compensation, translation, dilation and the Euclidean limit for small angles. The spherical
Mexican Hat wavelets (SMHW) are a particular case of isotropic wavelets which can thus
be constructed by a stereographic projection of the planar Mexican Hat ones (Martinez-Gonzalez
et al. 2002). Cayon et al. (2001) tested the Gaussianity of the COBE-DMR
data on the HEALPix pixelization using the skewness and kurtosis of the SMHW coefficients
at different angular scales and also the correlations at different scales. No deviation from
Gaussianity was detected.
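The compensation property listed above can be checked numerically. The following Python sketch evaluates an SMHW profile and integrates it over the sphere. The explicit formula is not given in this text: the form used below is the one commonly quoted from Martinez-Gonzalez et al. (2002), with y = 2 tan(theta/2) the stereographic radius, and it should be checked against that paper. The function names and the scale R = 0.1 rad are illustrative choices.

```python
import numpy as np

def smhw(theta, R):
    """Spherical Mexican Hat wavelet profile at polar angle theta (radians).

    Assumed form (to be checked against Martinez-Gonzalez et al. 2002):
    Psi(y, R) = [1 + (y/2)^2]^2 [2 - (y/R)^2] exp(-y^2 / 2R^2) / (sqrt(2 pi) N(R)),
    with y = 2 tan(theta/2) and N(R) = R sqrt(1 + R^2/2 + R^4/4).
    """
    y = 2.0 * np.tan(theta / 2.0)
    norm = R * np.sqrt(1.0 + R**2 / 2.0 + R**4 / 4.0)
    return ((1.0 + (y / 2.0) ** 2) ** 2 * (2.0 - (y / R) ** 2)
            * np.exp(-y**2 / (2.0 * R**2)) / (np.sqrt(2.0 * np.pi) * norm))

def sphere_integral(values, theta):
    """Integral over the sphere of an azimuthally symmetric function (trapezoid rule)."""
    integrand = 2.0 * np.pi * np.sin(theta) * values
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(theta)))

R = 0.1                                  # wavelet scale in radians (illustrative)
theta = np.linspace(0.0, 3.0, 200001)    # the profile is negligible well before theta = pi
psi = smhw(theta, R)
compensation = sphere_integral(psi, theta)       # should be close to zero
normalization = sphere_integral(psi**2, theta)   # should be close to one
```

With this form both the zero-mean (compensation) condition and the unit normalization hold to numerical precision, which is a useful sanity check before convolving a map with the wavelet.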

A comparative analysis of the performance of the two spherical wavelet families, SHW and
SMHW, already proposed in the literature to analyse the CMB statistical distribution has
been performed by Martinez-Gonzalez et al. (2002). The comparison was based on the power
to discriminate between standard inflationary models (Gaussian) and non-Gaussian models,
the latter consisting of small deviations from the Gaussian inflationary models that introduce an
artificially specified skewness or kurtosis through the Edgeworth expansion. The skewness
or kurtosis present at different wavelet scales was combined in an optimal linear way
using the Fisher discriminant. The result of this analysis is that the SMHW is much more
sensitive to these low-order cumulants than the SHW and it is therefore preferred for the
detection of this type of non-Gaussianity.
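A minimal sketch of how a Fisher linear discriminant combines a statistic measured at several wavelet scales into a single number. This is not the code of Martinez-Gonzalez et al. (2002): the simulated arrays, their sizes, the shift of 0.5 and the function names are placeholders chosen only for illustration.

```python
import numpy as np

def fisher_weights(stats_gauss, stats_nongauss):
    """Fisher linear discriminant weights for two sets of simulations.

    Each input has shape (n_simulations, n_scales): one statistic (for example
    the skewness of the wavelet coefficients) per scale and per simulation.
    """
    mean_diff = stats_nongauss.mean(axis=0) - stats_gauss.mean(axis=0)
    within = np.cov(stats_gauss, rowvar=False) + np.cov(stats_nongauss, rowvar=False)
    return np.linalg.solve(within, mean_diff)

def separation(a, b):
    """Distance between the two means in units of the combined scatter."""
    return abs(b.mean() - a.mean()) / np.sqrt(a.var() + b.var())

rng = np.random.default_rng(0)
n_sim, n_scales = 2000, 4
# Toy data: the non-Gaussian model shifts the statistic slightly at every scale.
gauss = rng.normal(0.0, 1.0, size=(n_sim, n_scales))
nongauss = rng.normal(0.5, 1.0, size=(n_sim, n_scales))

w = fisher_weights(gauss, nongauss)
t_gauss, t_nongauss = gauss @ w, nongauss @ w    # one combined number per simulation

combined = separation(t_gauss, t_nongauss)
single_scale = separation(gauss[:, 0], nongauss[:, 0])
```

By construction the combined statistic separates the two sets of simulations at least as well as any single scale, which is why the comparison between wavelet families is made on the combined quantity.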

More recently, the nonlinear coupling parameter $f_{NL}$ in the case of a quadratic term for
the gravitational potential has been constrained with the COBE-DMR data (Cayon et al.
2003a). The result, $|f_{NL}| < 1100$ (68% confidence level), represents the strongest constraint
obtained with those data.



3 Smooth tests of goodness-of-fit 

In this section, the score statistic is presented. This statistic is the starting point to construct 
the Rayner and Best test (Rayner and Best 1989, 1990), which is used in the present work 
to test the univariate and multivariate Gaussian distributions. 

Given a statistical variable $x$ and $n$ independent realizations $\{x_i\}$ ($i = 1, \ldots, n$), we want
to test if the probability density function of $x$ is equal to $f(x)$. This statistical variable
can be a real $p$-dimensional vector and, thus, the method can test, for example, if $f$ is a
multivariate normal distribution (this is one of the cases we are interested in). Smooth
tests are constructed to discriminate between the predetermined function $f(x)$ and a second
one that deviates smoothly from the former. We consider an alternative probability density
function $f(x, \theta)$ (where $\theta$ is a parameter vector) that deviates smoothly from $f(x)$ and
satisfies $f(x, \theta_0) = f(x)$. In other words, we want to test the null hypothesis $\theta = \theta_0$.

The probability density of $n$ independent realizations $\{x_i\}$ is given by $\prod_{i=1}^{n} f(x_i, \theta)$. Given
these measurements we calculate the estimated value of $\theta$ by means of the Maximum Likelihood
Method. We denote this value by $\hat{\theta}$.

We define $W$ such that

$$ W = 2 \left[ \ell(\{x_j\}, \hat{\theta}) - \ell(\{x_j\}, \theta_0) \right] \qquad (1) $$

Let $U(\theta)$ be a vector whose components are $U_i(\theta) = \partial \ell(\{x_j\}, \theta)/\partial \theta_i$, where the
log-likelihood is defined as $\ell(\{x_j\}, \theta) = \log \prod_{j=1}^{n} f(x_j, \theta) = \sum_{j=1}^{n} \log f(x_j, \theta)$.

Assuming $\hat{\theta}$ close to $\theta_0$, we expand $W$ and $U(\theta_0)$ in Taylor series around $\hat{\theta}$, and we have

$$ W \approx U^T(\theta_0) \left\{ -\frac{\partial U(\theta_0)}{\partial \theta_0} \right\}^{-1} U(\theta_0) \qquad (2) $$

This quantity is a direct measurement of the difference between $\hat{\theta}$ and $\theta_0$ and therefore
a test of the null hypothesis. We construct a quantity closely related to the last one. To
construct this quantity we replace $-\partial U(\theta_0)/\partial \theta_0$ by its mean value:

$$ S = U^T(\theta_0) \, I^{-1}(\theta_0) \, U(\theta_0) \qquad (3) $$

where $I$ is a matrix of components $I_{ij}(\theta) = \langle U_i(\theta) U_j(\theta) \rangle$ and it can be shown that it is equal
to $-\langle \partial U_i(\theta)/\partial \theta_j \rangle$. The approximation of $-\partial U_i(\theta)/\partial \theta_j$ by its mean value is valid when $n$ is
large$^1$. The $S$ quantity is the so-called score statistic. A wider description of the previous
development can be found in Cox and Hinkley (1974).

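$^1$ In fact $I_{ik}(\theta) = -\langle \partial U_i(\theta)/\partial \theta_k \rangle = -\sum_{j=1}^{n} \langle \partial^2 \log f(x_j, \theta)/\partial \theta_k \partial \theta_i \rangle = -n \langle \partial^2 \log f(x_j, \theta)/\partial \theta_k \partial \theta_i \rangle$, where
the last equality holds because $x_j$ is the same statistical variable for all $j$. When $n$ is large, the last mean
value can be approximated by $(1/n) \sum_{j=1}^{n} \partial^2 \log f(x_j, \theta)/\partial \theta_k \partial \theta_i$, and then $I_{ik}(\theta) \approx -\partial U_i(\theta)/\partial \theta_k$ when
$n \to \infty$.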
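The construction can be checked on the simplest possible case. The sketch below uses a one-parameter family f(x, theta) = N(theta, 1) with theta_0 = 0, chosen here purely for illustration (it is not a case treated in the text), and takes W to be the usual log-likelihood-ratio statistic 2[l(theta_hat) - l(theta_0)], which is an assumption of this sketch. For this family the Taylor expansion is exact, so W and S coincide and both equal n times the squared sample mean.

```python
import numpy as np

def log_likelihood(x, theta):
    """Log-likelihood of the sample x under f(x, theta) = N(theta, 1)."""
    return float(np.sum(-0.5 * (x - theta) ** 2 - 0.5 * np.log(2.0 * np.pi)))

def score_statistic(x, theta0=0.0):
    """S = U^T I^{-1} U for the one-parameter family N(theta, 1)."""
    n = x.size
    U = np.sum(x - theta0)      # derivative of the log-likelihood at theta0
    I = float(n)                # <U^2> = -<dU/dtheta> = n for this family
    return U * U / I

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=1000)

theta_hat = x.mean()            # maximum-likelihood estimate of theta
W = 2.0 * (log_likelihood(x, theta_hat) - log_likelihood(x, 0.0))
S = score_statistic(x)
```

The practical advantage of S over W is visible even here: S needs only derivatives evaluated at theta_0, so the maximum-likelihood estimate never has to be computed.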
3.1 Rayner-Best test: univariate Gaussian

Let us suppose that we have a statistical variable which takes values in the real domain.
Rayner and Best (1989, 1990) define an alternative probability density
function of order $k$ given by

$$ g_k(x) = C(\theta_1, \ldots, \theta_k) \exp\left\{ \sum_{i=1}^{k} \theta_i h_i(x) \right\} f(x) \qquad (4) $$

where $C$ is a normalization constant and the $h_i$ functions are orthonormal on $f$ with $h_0(x) = 1$.
Then, the score statistic associated to the $k$ alternative is given by

$$ S_k = \sum_{i=1}^{k} U_i^2 \quad \mathrm{with} \quad U_i = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} h_i(x_j) \qquad (5) $$

When we want to test if $f$ is a Gaussian function of zero mean and unit variance, the
$h_i$ functions become the normalized Hermite-Chebishev polynomials, that is, they are equal
to $P_n(x)/s_n$ with $s_n = \sqrt{n!}$ and $P_0(x) = 1$, $P_1(x) = x$ and, for $n \geq 1$, $P_{n+1}(x) = x P_n(x) - n P_{n-1}(x)$.
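The recurrence and the normalization can be verified with a few lines of Python. The sketch below builds the first six normalized polynomials and checks their orthonormality under N(0, 1) with Gauss-Hermite quadrature; the function name and the choice of 20 quadrature nodes are illustrative.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def normalized_hermite(k_max, x):
    """Return h_0..h_{k_max} evaluated at x, with h_n = P_n / sqrt(n!).

    Uses the recurrence P_0 = 1, P_1 = x, P_{n+1} = x P_n - n P_{n-1}.
    """
    x = np.asarray(x, dtype=float)
    P = [np.ones_like(x), x.copy()]
    for n in range(1, k_max):
        P.append(x * P[n] - n * P[n - 1])
    return np.array([P[n] / math.sqrt(math.factorial(n)) for n in range(k_max + 1)])

# Gauss-Hermite quadrature for the weight exp(-x^2/2); dividing the weights by
# sqrt(2 pi) turns the weighted sums into expectations under N(0, 1).
nodes, weights = hermegauss(20)
weights = weights / np.sqrt(2.0 * np.pi)

h = normalized_hermite(5, nodes)     # shape (6, 20)
gram = (h * weights) @ h.T           # gram[a, b] = <h_a h_b> under N(0, 1)
```

The matrix `gram` should be the 6 x 6 identity up to rounding, which confirms both the recurrence and the factor sqrt(n!).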

Every alternative function, that is, every $k$ value, gives a statistic $S_k$. The statistics $S_k$
are given by: $S_1 = n(\hat{\mu}_1)^2$, $S_2 = S_1 + n(\hat{\mu}_2 - 1)^2/2$, $S_3 = S_2 + n(\hat{\mu}_3 - 3\hat{\mu}_1)^2/6$, $S_4 = S_3 +
n([\hat{\mu}_4 - 3] - 6[\hat{\mu}_2 - 1])^2/24$ and $S_5 = S_4 + n(\hat{\mu}_5 - 10\hat{\mu}_3 + 15\hat{\mu}_1)^2/120$, where $\hat{\mu}_\alpha = (\sum_{j=1}^{n} x_j^\alpha)/n$.
Thus, the statistic $S_k$ is related to cumulants of order $\leq k$, and then this test is directional,
that is, it indicates how the actual distribution deviates from Gaussianity. For example, if
$S_1$ and $S_2$ are small and $S_3$ is large, then the data have a large $\hat{\mu}_3$ value and also have a large
skewness value because of the relation between $\hat{\mu}_3$ and $S_3$.
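As a consistency check, the moment expressions above can be compared with the definition of the statistic as a sum of squared polynomial averages, U_i = sum_j h_i(x_j) / sqrt(n). The Python sketch below computes S_1 to S_5 both ways on simulated N(0, 1) data; the function names and the sample size are illustrative.

```python
import math
import numpy as np

def s_statistics_from_moments(x):
    """S_1..S_5 from the sample moments mu_a = sum(x**a) / n."""
    n = x.size
    mu = {a: np.mean(x ** a) for a in range(1, 6)}
    terms = [
        n * mu[1] ** 2,
        n * (mu[2] - 1.0) ** 2 / 2.0,
        n * (mu[3] - 3.0 * mu[1]) ** 2 / 6.0,
        n * ((mu[4] - 3.0) - 6.0 * (mu[2] - 1.0)) ** 2 / 24.0,
        n * (mu[5] - 10.0 * mu[3] + 15.0 * mu[1]) ** 2 / 120.0,
    ]
    return np.cumsum(terms)

def s_statistics_from_polynomials(x, k_max=5):
    """S_k = sum_i U_i^2 with U_i = sum_j h_i(x_j) / sqrt(n)."""
    n = x.size
    P_prev, P_curr = np.ones_like(x), x.copy()     # P_0 and P_1
    U2 = []
    for i in range(1, k_max + 1):
        h_i = P_curr / math.sqrt(math.factorial(i))
        U2.append(np.sum(h_i) ** 2 / n)
        P_prev, P_curr = P_curr, x * P_curr - i * P_prev   # step to P_{i+1}
    return np.cumsum(U2)

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=5000)
S_moments = s_statistics_from_moments(x)
S_polys = s_statistics_from_polynomials(x)
```

The two arrays agree to rounding error. The increments S_k - S_{k-1} are the individual U_k^2 terms, and it is these increments that carry the directional information.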

When $n \to \infty$, the $S_k$ statistic is distributed as a $\chi^2_k$. This holds because $U_i$ is Gaussian
distributed when $n \to \infty$ (sum of a large number of independent variables).

In the case of the CMB, the data are correlated and therefore not independent. Let $x_i$ be
the value of pixel $i$. If we perform the Cholesky decomposition of the correlation matrix
of the data, $C = LL^T$, and change to the variable $y_j = \sum_i L^{-1}_{ji} x_i$, then these new data are
uncorrelated with zero mean and unit variance and, if Gaussianity holds, they are
independent with a $N(0, 1)$ distribution. We then work with these data and apply to them
the test described here.
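The decorrelation step can be sketched in Python as follows. The exponential correlation matrix, the number of pixels and the number of realizations are toy assumptions standing in for a real pixel-pixel correlation matrix (signal plus noise); only the Cholesky transformation itself comes from the text.

```python
import numpy as np

def whiten(x, C):
    """Return y = L^{-1} x, where C = L L^T is the Cholesky factorization."""
    L = np.linalg.cholesky(C)        # lower-triangular factor
    return np.linalg.solve(L, x)     # solves L y = x without forming L^{-1}

# Toy correlation matrix for 50 "pixels": exponential decay with separation.
n_pix = 50
idx = np.arange(n_pix)
C = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 5.0)

rng = np.random.default_rng(3)
L = np.linalg.cholesky(C)
maps = L @ rng.normal(size=(n_pix, 20000))   # 20000 correlated Gaussian realizations
y = whiten(maps, C)
cov_y = np.cov(y)                            # should be close to the identity
```

Solving the triangular system instead of inverting L is the usual choice for numerical stability. For a real experiment the matrix C has one row per pixel, so the factorization cost grows as the cube of the number of pixels.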

In Cayon et al. (2003b), the power of these statistics is studied and the method is used to 
test the Gaussianity of the data of the MAXIMA experiment (Balbi et al. 2000, Hanany et al. 
2000). The analysis finds these data compatible with Gaussianity under these goodness-of-fit 
tests. 

These statistics are also used in Aliaga et al. (2003) to constrain the skewness and kurtosis
of the mentioned data. In this work the skewness $S$ and the kurtosis $K$ are constrained to
$|S| < 0.035$ and $|K| < 0.036$, at 99% confidence level.
3.2 Rayner-Best test: multivariate Gaussian

In this case, the statistical variable $x$ takes values in a $p$-dimensional vector space and we
want to test if $f(x)$ is a multinormal distribution of mean equal to $\mu$ and correlation matrix
equal to $\Sigma$. For the sake of clarity we are going to change the notation of the vector
quantities $x$ and $\mu$ to $\vec{x}$ and $\vec{\mu}$. As in the previous subsection, we need a set
of orthonormal functions on $f$. Rayner and Best (1989) propose the following construction:
the Cholesky decomposition of the correlation matrix is performed, $\Sigma = LL^T$, and the
matrix $A = L^{-1}$ is obtained. Thus, one constructs the new variables $\vec{y} = A(\vec{x} - \vec{\mu})$. It is
straightforward to see that the functions $L_{r_1 \ldots r_p}(\vec{y}) = H_{r_1}(y_1) \cdots H_{r_p}(y_p)$ are orthonormal on
$f(\vec{x})$, where the $H_s$ function is the normalized Hermite-Chebishev polynomial of degree $s$
and $y_r$ ($r = 1, \ldots, p$) is the $r$ component of the $\vec{y}$ vector. The degree of the $L_{r_1 \ldots r_p}$ function
is equal to $r = r_1 + \cdots + r_p$.

In the case of the Rayner-Best test applied to the univariate Gaussian distribution, the $h_i$
functions are ordered by their degree $i$. In that case the ordering is easy to establish, but in the
case of the multivariate Gaussian, different $L_{r_1 \ldots r_p}$ functions can have the same degree. Thus,
suppose some ordering has been imposed, such that the degree $r$ functions are considered
before the degree $r+1$ ones. Call this ordered system $\{L_s^{(r)}(\vec{y})\}$, where $s$ is the position in
the ordering and $r$ indicates the degree of the function.

The number of functions of degree $r$ is the combinatorial number ${}^{p+r-1}C_r$.
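The counting can be confirmed by brute force. The Python sketch below enumerates all multi-indices (r_1, ..., r_p) of a given total degree for p = 3 and compares the counts with the combinatorial number; it also accumulates the counts up to degree r, which is the number k of functions entering S_k. The function name is illustrative.

```python
from itertools import product
from math import comb

def multi_indices(p, r):
    """All (r_1, ..., r_p) with non-negative integer entries summing to r."""
    return [idx for idx in product(range(r + 1), repeat=p) if sum(idx) == r]

p = 3
counts = {r: len(multi_indices(p, r)) for r in range(1, 5)}
cumulative = {r: sum(counts[s] for s in range(1, r + 1)) for r in range(1, 5)}
```

For p = 3 the counts are 3, 6, 10 and 15 for degrees 1 to 4, and the cumulative counts up to degrees 3 and 4 are 19 and 34. These are the numbers of degrees of freedom of the chi-square distributions quoted later in this section.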

To calculate the $S_k$ statistic, we start from an alternative distribution like the one shown
in expression (4), but the $h_i$ functions are replaced by the $L_s^{(r)}$ ones. The order of the
alternative function, that is, the index $k$ of $g_k$, represents the number of functions used.
If we use functions up to degree $r$, then $k = \sum_{s=1}^{r} {}^{p+s-1}C_s = {}^{p+r}C_r - 1$. In an analogous way
to the univariate case, $S_k$ is given by

$$ S_k = \sum_{i=1}^{k} V_i^2 \quad \mathrm{with} \quad V_i = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} L_i^{(s)}\left( A(\vec{x}_j - \vec{\mu}) \right) \qquad (6) $$

It is interesting to construct quantities like the $U_i^2$ ones of expression (5), which have
a defined degree and whose sum gives the $S_k$ statistic. To do this, in the sum of the $V_i^2$
terms in (6) one separates and groups the terms which have the same degree, and then, if
we go up to degree $r$:

$$ S_k = U_1^2 + \cdots + U_r^2 \quad \mathrm{with} \quad U_s^2 = \frac{1}{n} \sum_{i} \left[ \sum_{j=1}^{n} L_i^{(s)}\left( A(\vec{x}_j - \vec{\mu}) \right) \right]^2 \qquad (7) $$

In the previous expression, for every $U_s^2$, the sum over the $i$ index runs over all the
functions of degree equal to $s$, and so $U_s^2$ is a quantity of degree $s$. The $\vec{x}_j$ vector is the
sample $j$ of our experiment. When $n \to \infty$, $V_i$ is Gaussian distributed (with zero mean
and unit variance) because of the Central Limit Theorem applied to the independent $\{\vec{x}_j\}$
samples. Thus $V_i^2 \sim \chi^2_1$ and $U_s^2 \sim \chi^2_\nu$ with $\nu = {}^{p+s-1}C_s$. If we calculate the $S_k$ statistics,
or the $U_s^2$ quantities, as shown in equation (7), the order we have
imposed on the $L_i^{(s)}$ functions is not important; the only important aspect is how many functions
of degree $s$ there are. If we know these functions we only have to add their contributions to construct $U_s^2$.
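A direct, unoptimized Python sketch of equation (7) is given below. It assumes the samples have already been transformed to y = A(x - mu), enumerates the products of normalized Hermite-Chebishev polynomials of each total degree, and sums the squared averages. The function names and the simulated data (p = 3, n = 5000, independent N(0, 1) components) are illustrative.

```python
import math
from itertools import product
import numpy as np

def hermite_table(y, s_max):
    """H[d] = normalized Hermite polynomial of degree d evaluated on the array y."""
    P = [np.ones_like(y), y.copy()]
    for n in range(1, s_max):
        P.append(y * P[n] - n * P[n - 1])
    return [P[d] / math.sqrt(math.factorial(d)) for d in range(s_max + 1)]

def u_squared(y, s_max):
    """Return {s: U_s^2} for whitened samples y of shape (n, p)."""
    n, p = y.shape
    H = hermite_table(y, s_max)                  # H[d][j, c] = H_d(y_jc)
    out = {}
    for s in range(1, s_max + 1):
        total = 0.0
        for degs in product(range(s + 1), repeat=p):
            if sum(degs) != s:
                continue
            # L(y_j) = prod_c H_{degs[c]}(y_jc), one value per sample j
            L_vals = np.prod([H[d][:, c] for c, d in enumerate(degs)], axis=0)
            total += np.sum(L_vals) ** 2 / n
        out[s] = total
    return out

rng = np.random.default_rng(4)
y = rng.normal(size=(5000, 3))                   # already whitened: A = identity, mu = 0
U2 = u_squared(y, 4)
S3 = U2[1] + U2[2] + U2[3]
S4 = S3 + U2[4]
```

Because each U_s^2 is a plain sum over the degree-s functions, the loop order does not matter, in line with the remark above that only the set of functions of each degree is relevant.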
In the appendix, the $U_s^2$ quantities are given for $s = 1, 2, 3, 4$, and their distributions
are shown in Figure 1. These distributions are calculated with 5000 realizations, where one
realization consists of $n = 5000$ 3-dimensional vectors ($p = 3$). Then, as $n$ is very large,
we see that the shown distributions are very close to the asymptotic distributions, that is,
$U_1^2 \sim \chi^2_3$, $U_2^2 \sim \chi^2_6$, $U_3^2 \sim \chi^2_{10}$ and $U_4^2 \sim \chi^2_{15}$. In Table 1 the values of the mean and
standard deviation ($\sigma$) of the $U_s^2$ distributions are shown. These values are compared with
the values of the corresponding asymptotic distributions $\chi^2_\nu$. The distributions of the $S_3$ and
$S_4$ statistics are shown in Figure 2. For $S_3$ the obtained mean value is 18.961 and the
standard deviation 6.259. For $S_4$ the same quantities are 33.932 and 8.534. In the asymptotic
case $S_3 \sim \chi^2_{19}$ (with mean value 19 and standard deviation 6.164) and $S_4 \sim \chi^2_{34}$ (with mean
value 34 and standard deviation 8.246).
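The asymptotic values quoted above follow from the standard result that a chi-square variable with nu degrees of freedom has mean nu and standard deviation sqrt(2 nu). A short Python check for p = 3 (the helper name is illustrative):

```python
from math import comb, sqrt

def chi2_mean_std(nu):
    """Mean and standard deviation of a chi-square variable with nu degrees of freedom."""
    return float(nu), sqrt(2.0 * nu)

p = 3
for s in range(1, 5):
    nu = comb(p + s - 1, s)
    mean, std = chi2_mean_std(nu)
    print(f"U_{s}^2: nu = {nu:2d}, mean = {mean:.3f}, sigma = {std:.3f}")

for r in (3, 4):
    nu = comb(p + r, r) - 1
    mean, std = chi2_mean_std(nu)
    print(f"S_{r}:   nu = {nu:2d}, mean = {mean:.3f}, sigma = {std:.3f}")
```

For S_3 and S_4 this gives standard deviations of 6.164 and 8.246, the asymptotic values quoted in the text.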

As an example, we calculate the distributions of the $U_s^2$ statistics when the $y_{jr}$ are uniformly
distributed with mean zero and standard deviation equal to one. The first moment of this
distribution which differs from the corresponding moment of the Gaussian distribution is the
moment of degree four, which fixes the kurtosis. So, the distribution of the statistic $U_4^2$ must be different
(see equation (11)) when we compare the uniform case with the Gaussian one. The mean
value and the standard deviation of the distributions are shown in Table 2 and, in fact, we see
that we can discriminate between a Gaussian distribution and a uniform distribution (in this
case with $p = 3$), because the distributions of $U_4^2$ are very different. Also the distributions
of $U_3^2$ are different enough for discrimination in this case.
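The size of the effect can be estimated without simulations. For independent samples, each function L of degree s contributes Var[L] + n E[L]^2 to the expected value of U_s^2, so a non-zero mean of any degree-s function grows linearly with n. The Python sketch below evaluates this expectation for components uniform on [-sqrt(3), sqrt(3)] (zero mean, unit variance), using Gauss-Legendre quadrature. The function names are illustrative and the calculation is an independent cross-check, not the procedure used to produce Table 2.

```python
import math
from itertools import product
import numpy as np
from numpy.polynomial.legendre import leggauss

def uniform_hermite_moments(s_max):
    """E[H_d] and E[H_d^2] for y uniform on [-sqrt(3), sqrt(3)]."""
    nodes, weights = leggauss(20)          # exact for the polynomial degrees used here
    y = math.sqrt(3.0) * nodes             # map [-1, 1] to [-sqrt(3), sqrt(3)]
    w = weights / 2.0                      # uniform density: the weights sum to one
    P = [np.ones_like(y), y.copy()]
    for n in range(1, s_max):
        P.append(y * P[n] - n * P[n - 1])
    H = [P[d] / math.sqrt(math.factorial(d)) for d in range(s_max + 1)]
    return [float(w @ h) for h in H], [float(w @ h**2) for h in H]

def expected_u_squared(s, p, n, s_max=4):
    """E[U_s^2] = sum over degree-s functions of (Var[L] + n E[L]^2)."""
    m1, m2 = uniform_hermite_moments(s_max)
    total = 0.0
    for degs in product(range(s + 1), repeat=p):
        if sum(degs) != s:
            continue
        mean = math.prod(m1[d] for d in degs)      # the components are independent
        second = math.prod(m2[d] for d in degs)
        total += (second - mean**2) + n * mean**2
    return total

expected = {s: expected_u_squared(s, p=3, n=5000) for s in range(1, 5)}
```

For p = 3 and n = 5000 this gives expected values of about 3.0, 4.2, 4.43 and 905 for s = 1 to 4, to be compared with the uniform-case means of Table 2. The large value for s = 4 comes almost entirely from the three functions H_4(y_r), whose mean is -1.2/sqrt(24) for the uniform distribution, so each contributes about 0.06 n = 300.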

Finally, the question is how to apply this test when we have only one map of the CMB. We
need a large number of samples $n$ to estimate the sums from $j = 1$ to $n$ that we have in
expressions (5) to (11); that is, given the initial map we need to extract independent samples.
One way to obtain them is: first, to define the dimension $p$ of the multinormality we want to
test and then divide the CMB map into patches of $p$ pixels distributed in the same way within
each patch; second, to try to decorrelate them. Assuming that all the data are multinormal,
the patches can be considered as independent samples after decorrelation. In this way
we can apply the method described above. Work is in progress to apply it to WMAP.
Appendix

In this appendix the $U_s^2$ quantities for the multinormal distribution (Section 3.2) are explicitly
given for $s = 1, 2, 3, 4$. We consider an experiment consisting of $n$ $p$-dimensional vectors, where
the $r$ component of the vector $j$ is denoted by $y_{jr}$ ($r = 1, \ldots, p$). From equation (7) and
taking the explicit forms of the normalized Hermite-Chebishev functions we can obtain the
following expressions:



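$$ U_1^2 = \frac{1}{n} \sum_{r=1}^{p} \left[ \sum_{j=1}^{n} y_{jr} \right]^2 \qquad (8) $$

$$ U_2^2 = \frac{1}{n} \left\{ \frac{1}{2} \sum_{r=1}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^2 - 1) \right]^2 + \sum_{r=1}^{p-1} \sum_{s=r+1}^{p} \left[ \sum_{j=1}^{n} y_{jr} y_{js} \right]^2 \right\} \qquad (9) $$

$$ U_3^2 = \frac{1}{n} \left\{ \frac{1}{6} \sum_{r=1}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^3 - 3 y_{jr}) \right]^2 + \frac{1}{2} \sum_{r=1}^{p} \sum_{\substack{s=1 \\ (s \neq r)}}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^2 - 1) y_{js} \right]^2 + \sum_{r=1}^{p-2} \sum_{s=r+1}^{p-1} \sum_{t=s+1}^{p} \left[ \sum_{j=1}^{n} y_{jr} y_{js} y_{jt} \right]^2 \right\} \qquad (10) $$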





           $U_1^2$   $\chi^2_3$    $U_2^2$   $\chi^2_6$    $U_3^2$   $\chi^2_{10}$   $U_4^2$   $\chi^2_{15}$
mean       2.966     3.000         5.968     6.000         10.031    10.000          15.012    15.000
$\sigma$   2.480     2.449         3.462     3.464         4.551     4.472           5.763     5.477

Table 1: Values of the mean and standard deviation ($\sigma$) of the distributions of $U_s^2$ for 5000
realizations of an experiment consisting of $n = 5000$ samples of a data vector of dimension
$p = 3$ multinormally distributed whose components are independent. These values are
compared with the values of the corresponding asymptotic distributions $\chi^2_\nu$. The relation
between $p$, $s$ and $\nu$ is $\nu = {}^{p+s-1}C_s$.





           $U_1^2$(g)  $U_1^2$(u)   $U_2^2$(g)  $U_2^2$(u)   $U_3^2$(g)  $U_3^2$(u)   $U_4^2$(g)  $U_4^2$(u)
mean       2.966       2.975        5.968       4.192        10.031      4.353        15.012      904.9
$\sigma$   2.480       2.428        3.462       2.601        4.551       2.140        5.763       38.67

Table 2: Values of the mean and standard deviation ($\sigma$) of the distributions of $U_s^2$ for 5000
realizations of an experiment consisting of $n = 5000$ samples of a data vector of dimension $p = 3$
multinormally distributed (g) and uniformly distributed (u) (each component is uniformly
distributed). The components of the data vectors have zero mean, unit variance and are
uncorrelated.



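$$ U_4^2 = \frac{1}{n} \left\{ \frac{1}{24} \sum_{r=1}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^4 - 6 y_{jr}^2 + 3) \right]^2 + \frac{1}{6} \sum_{r=1}^{p} \sum_{\substack{s=1 \\ (s \neq r)}}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^3 - 3 y_{jr}) y_{js} \right]^2 + \frac{1}{4} \sum_{r=1}^{p-1} \sum_{s=r+1}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^2 - 1)(y_{js}^2 - 1) \right]^2 + \frac{1}{2} \sum_{r=1}^{p} \sum_{\substack{s=1 \\ (s \neq r)}}^{p-1} \sum_{\substack{t=s+1 \\ (t \neq r)}}^{p} \left[ \sum_{j=1}^{n} (y_{jr}^2 - 1) y_{js} y_{jt} \right]^2 + \sum_{r=1}^{p-3} \sum_{s=r+1}^{p-2} \sum_{t=s+1}^{p-1} \sum_{q=t+1}^{p} \left[ \sum_{j=1}^{n} y_{jr} y_{js} y_{jt} y_{jq} \right]^2 \right\} \qquad (11) $$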



In order to establish the validity of the multinormal hypothesis we need to know the
distribution of each of the $U_s^2$ statistics and also of their combination to get $S_k$. With these
distributions we can know if a given data set is compatible with multinormality. The
distributions for the case $n = 5000$ and $p = 3$ are shown in Figure 1, and Table 1 compares
the values of the mean and the standard deviation ($\sigma$) obtained for 5000 realizations with
the asymptotic values. The mean values of all $U_s^2$ are equal to $\nu$ independently of the $n$ value
($\nu = {}^{p+s-1}C_s$). The distributions of $S_3$ and $S_4$ are shown in Figure 2.
Figure 1: The $U_s^2$ distributions obtained from 5000 realizations of experiments of $n = 5000$
vectors of dimension $p = 3$. From left to right, top to bottom: $s = 1, 2, 3, 4$.
Figure 2: The $S_3$ and $S_4$ distributions obtained from 5000 realizations of experiments of
$n = 5000$ vectors of dimension $p = 3$.